Constraint Based Hybrid Approach to Parsing Indian Languages
Abstract
The paper describes the overall design of a new two-stage, constraint-based hybrid approach to dependency parsing. We define the two stages and show how different grammatical constructs are parsed at the appropriate stage. This division leads to selective identification and resolution of specific dependency relations at each of the two stages. Furthermore, we show how the use of hard constraints and soft constraints helps us build an efficient and robust hybrid parser. Finally, we evaluate the implemented parser on Hindi and compare the results with those of two data-driven dependency parsers.
Similar resources
A Text Chunker and Hybrid POS Tagger for Indian Languages
Part-of-Speech (POS) tagging can be described as a task of doing automatic annotation of syntactic categories for each word in a text document. This paper presents a generic hybrid POS tagger for Indian languages. Indian languages are relatively free word order, morphologically productive and agglutinative languages. In this hybrid implementation we have used combination of statistical approach...
Parsing Languages with a Configurator
Recent evolution of linguistic theories relies heavily upon the concept of constraint. Also, several authors have pointed out the similitude existing between the categories of feature-based theories and the notions of objects or frames. We show that a generalization of constraint programs called configuration programs can be applied to natural language parsing. We propose here a systematic transl...
An Affinity Based Greedy Approach towards Chunking for Indian Languages
A robust chunker can drastically reduce the complexity of parsing natural language text. Chunking for Indian languages requires a novel approach because of the relatively unrestricted order of words within a word group. A computational framework for chunking based on valency theory and feature structures has been described here. The paper also draws an analogy of chunk formation in free word ...
Morphological Analyzer for Gujarati using Paradigm based approach with Knowledge based and Statistical Methods
Morphological Analyzer is a tool which performs syntactic analysis of a word and finds root form of input inflected word form. Morph analyzer serves as a pre-processing tool for many NLP applications. Significant amount of work has been done in this area for many Indian languages but not much work has been reported for Gujarati language. We present Morph analyzer for Gujarati language. The Morp...
Cost Effective Dependency Parsing for Indian Languages
Indian languages are MoR-FWO and hence differ from English in structure and morphology. There are many distinguishing characteristics possessed by Indian languages. While working with these languages, we have to keep these characteristics in mind and plan strategies accordingly. We worked on improving Dependency Parsing for Indian Languages, more specifically for Hindi, an Indo-Aryan Language. ...
Publication date: 2009